Le TLF dans ISTEX

Attention, ce site est en cours de développement !
Attention, site généré par des moyens informatiques à partir de corpus bruts.
Les informations ne sont donc pas validées.

Abstraction and approximate decision-theoretic planning

Identifieur interne : 000118 ( Main/Exploration ); précédent : 000117; suivant : 000119

Abstraction and approximate decision-theoretic planning

Auteurs : Richard Dearden [Canada] ; Craig Boutilier [Canada]

Source :

RBID : ISTEX:A82685ED61F587C143A9F32E055C63BDAEF7DE44

Descripteurs français

English descriptors

Abstract

Markov decision processes (MDPs) have recently been proposed as useful conceptual models for understanding decision-theoretic planning. However, the utility of the associated computational methods remains open to question: most algorithms for computing optimal policies require explicit enumeration of the state space of the planning problem. We propose an abstraction technique for MDPs that allows approximately optimal solutions to be computed quickly. Abstractions are generated automatically, using an intensional representation of the planning problem (probabilistic strips rules) to determine the most relevant problem features and optimally solving a reduced problem based on these relevant features. The key features of our method are: abstractions can be generated quickly; the abstract solution can be applied directly to the original problem; and the loss of optimality can be bounded. We also describe methods by which the abstract solution can be viewed as a set of default reactions that can be improved incrementally, and used as a heuristic for search-based planning or other MDP methods. Finally, we discuss certain difficulties that point toward other forms of aggregation for MDPs.

Url:
DOI: 10.1016/S0004-3702(96)00023-9


Affiliations:


Links toward previous steps (curation, corpus...)


Le document en format XML

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Abstraction and approximate decision-theoretic planning</title>
<author>
<name sortKey="Dearden, Richard" sort="Dearden, Richard" uniqKey="Dearden R" first="Richard" last="Dearden">Richard Dearden</name>
</author>
<author>
<name sortKey="Boutilier, Craig" sort="Boutilier, Craig" uniqKey="Boutilier C" first="Craig" last="Boutilier">Craig Boutilier</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:A82685ED61F587C143A9F32E055C63BDAEF7DE44</idno>
<date when="1997" year="1997">1997</date>
<idno type="doi">10.1016/S0004-3702(96)00023-9</idno>
<idno type="url">https://api.istex.fr/document/A82685ED61F587C143A9F32E055C63BDAEF7DE44/fulltext/pdf</idno>
<idno type="wicri:Area/Main/Corpus">000356</idno>
<idno type="wicri:Area/Main/Curation">000321</idno>
<idno type="wicri:Area/Main/Exploration">000118</idno>
<idno type="wicri:explorRef" wicri:stream="Main" wicri:step="Exploration">000118</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">Abstraction and approximate decision-theoretic planning</title>
<author>
<name sortKey="Dearden, Richard" sort="Dearden, Richard" uniqKey="Dearden R" first="Richard" last="Dearden">Richard Dearden</name>
<affiliation wicri:level="4">
<country>Canada</country>
<wicri:regionArea>Department of Computer Science, University of British Columbia, Vancouver, BC</wicri:regionArea>
<orgName type="university">Université de la Colombie-Britannique</orgName>
<placeName>
<settlement type="city">Vancouver</settlement>
<region type="state">Colombie-Britannique </region>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Canada</country>
</affiliation>
</author>
<author>
<name sortKey="Boutilier, Craig" sort="Boutilier, Craig" uniqKey="Boutilier C" first="Craig" last="Boutilier">Craig Boutilier</name>
<affiliation wicri:level="4">
<country>Canada</country>
<wicri:regionArea>Department of Computer Science, University of British Columbia, Vancouver, BC</wicri:regionArea>
<orgName type="university">Université de la Colombie-Britannique</orgName>
<placeName>
<settlement type="city">Vancouver</settlement>
<region type="state">Colombie-Britannique </region>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Canada</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="j">Artificial Intelligence</title>
<title level="j" type="abbrev">ARTINT</title>
<idno type="ISSN">0004-3702</idno>
<imprint>
<publisher>ELSEVIER</publisher>
<date type="published" when="1997">1997</date>
<biblScope unit="volume">89</biblScope>
<biblScope unit="issue">1–2</biblScope>
<biblScope unit="page" from="219">219</biblScope>
<biblScope unit="page" to="283">283</biblScope>
</imprint>
<idno type="ISSN">0004-3702</idno>
</series>
<idno type="istex">A82685ED61F587C143A9F32E055C63BDAEF7DE44</idno>
<idno type="DOI">10.1016/S0004-3702(96)00023-9</idno>
<idno type="PII">S0004-3702(96)00023-9</idno>
<idno type="ArticleID">96000239</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0004-3702</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Abstraction</term>
<term>Approximation</term>
<term>Decision theory</term>
<term>Execution</term>
<term>Heuristics</term>
<term>Markov decision processes</term>
<term>Planning</term>
<term>Search</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr">
<term>Perquisition</term>
</keywords>
</textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Markov decision processes (MDPs) have recently been proposed as useful conceptual models for understanding decision-theoretic planning. However, the utility of the associated computational methods remains open to question: most algorithms for computing optimal policies require explicit enumeration of the state space of the planning problem. We propose an abstraction technique for MDPs that allows approximately optimal solutions to be computed quickly. Abstractions are generated automatically, using an intensional representation of the planning problem (probabilistic strips rules) to determine the most relevant problem features and optimally solving a reduced problem based on these relevant features. The key features of our method are: abstractions can be generated quickly; the abstract solution can be applied directly to the original problem; and the loss of optimality can be bounded. We also describe methods by which the abstract solution can be viewed as a set of default reactions that can be improved incrementally, and used as a heuristic for search-based planning or other MDP methods. Finally, we discuss certain difficulties that point toward other forms of aggregation for MDPs.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Canada</li>
</country>
<region>
<li>Colombie-Britannique </li>
</region>
<settlement>
<li>Vancouver</li>
</settlement>
<orgName>
<li>Université de la Colombie-Britannique</li>
</orgName>
</list>
<tree>
<country name="Canada">
<region name="Colombie-Britannique ">
<name sortKey="Dearden, Richard" sort="Dearden, Richard" uniqKey="Dearden R" first="Richard" last="Dearden">Richard Dearden</name>
</region>
<name sortKey="Boutilier, Craig" sort="Boutilier, Craig" uniqKey="Boutilier C" first="Craig" last="Boutilier">Craig Boutilier</name>
<name sortKey="Boutilier, Craig" sort="Boutilier, Craig" uniqKey="Boutilier C" first="Craig" last="Boutilier">Craig Boutilier</name>
<name sortKey="Dearden, Richard" sort="Dearden, Richard" uniqKey="Dearden R" first="Richard" last="Dearden">Richard Dearden</name>
</country>
</tree>
</affiliations>
</record>

Pour manipuler ce document sous Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/TlfNancyV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000118 | SxmlIndent | more

Ou

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000118 | SxmlIndent | more

Pour mettre un lien sur cette page dans le réseau Wicri

{{Explor lien
   |wiki=    Wicri/Lorraine
   |area=    TlfNancyV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:A82685ED61F587C143A9F32E055C63BDAEF7DE44
   |texte=   Abstraction and approximate decision-theoretic planning
}}

Wicri

This area was generated with Dilib version V0.6.39.
Data generation: Tue Mar 12 10:51:38 2024. Site generation: Tue Mar 12 10:52:26 2024